Efficient processing of spatial joins with DOT-based indexing
نویسندگان
چکیده
0020-0255/$ see front matter 2009 Elsevier Inc doi:10.1016/j.ins.2009.11.029 * Corresponding author. Tel./fax: +82 2 2220 456 E-mail addresses: [email protected] (H. B (S. Park), [email protected] (S.-W. Kim). A spatial join is a query that searches for a set of object pairs satisfying a given spatial relationship from a database. It is one of the most costly queries, and thus requires an efficient processing algorithm that fully exploits the features of the underlying spatial indexes. In our earlier work, we devised a fairly effective algorithm for processing spatial joins with double transformation (DOT) indexing, which is one of several spatial indexing schemes. However, the algorithm is restricted to only the one-dimensional cases. In this paper, we extend the algorithm for the two-dimensional cases, which are general in Geographic Information Systems (GIS) applications. We first extend DOT to two-dimensional original space. Next, we propose an efficient algorithm for processing range queries using extended DOT. This algorithm employs the quarter division technique and the tri-quarter division technique devised by analyzing the regularity of the space-filling curve used in DOT. This greatly reduces the number of space transformation operations. We then propose a novel spatial join algorithm based on this range query processing algorithm. In processing a spatial join, we determine the access order of disk pages so that we can minimize the number of disk accesses. We show the superiority of the proposed method by extensive experiments using data sets of various distributions and sizes. The experimental results reveal that the proposed method improves the performance of spatial join processing up to three times in comparison with the widely-used R-tree-based spatial join method. 2009 Elsevier Inc. All rights reserved.
منابع مشابه
ISP: Large-Scale In-memory Spatial Data Processing System (Demo Paper)
Huge amount of spatial data such as GPS locations is being generated everyday, which brings big challenges of efficient spatial data processing. Many existing big spatial data processing techniques are mostly based on disk-resident systems. They have not fully taken advantages of modern hardware, such as large main memory capacities and multi-core processors. In this paper, we demonstrate our I...
متن کاملHigh Dimensional Similarity Joins: Algorithms and Performance Evaluation
ÐCurrent data repositories include a variety of data types, including audio, images, and time series. State-of-the-art techniques for indexing such data and doing query processing rely on a transformation of data elements into points in a multidimensional feature space. Indexing and query processing then take place in the feature space. In this paper, we study algorithms for finding relationshi...
متن کاملEfficient Temporal Join Processing Using Indices
We examine the problem of processing temporal joins in the presence of indexing schemes. Previous work on temporal joins has concentrated on non-indexed relations which were fully scanned. Given the large data volumes created by the ever increasing time dimension, sequential scanning is prohibitive. This is especially true when the temporal join involves only parts of the joining relations (e.g...
متن کاملAn interactive framework for spatial joins: a statistical approach to data analysis in GIS
Many Geographic Information Systems (GIS) handle a large volume of geospatial data. Spatial joins over two or more geospatial datasets are very common operations in GIS for data analysis and decision support. However, evaluating spatial joins can be very time intensive due to the size of datasets. In this paper, we propose an interactive framework that provides faster approximate answers of spa...
متن کاملCascading map-side joins over HBase for scalable join processing
One of the major challenges in large-scale data processing with MapReduce is the smart computation of joins. Since Semantic Web datasets published in RDF have increased rapidly over the last few years, scalable join techniques become an important issue for SPARQL query processing as well. In this paper, we introduce the Map-Side Index Nested Loop Join (MAPSIN join) which combines scalable index...
متن کاملذخیره در منابع من
با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید
برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید
ثبت ناماگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید
ورودعنوان ژورنال:
- Inf. Sci.
دوره 180 شماره
صفحات -
تاریخ انتشار 2010